System that Identifies Writers
Authors
Abstract
Because writer identification plays an important investigative and forensic role in many types of crime, various automatic techniques and methods for feature extraction, comparison, and performance evaluation have been studied (see [1] for an extensive survey). The practice rests on the hypothesis that people's handwriting is as distinctive as their individual natures and their fingerprints [2,3]. However, relatively little work has been done to demonstrate its scientific and statistical validity and reliability as forensic evidence [4], or to answer the question of whether one can build a machine that identifies writers. For this reason, we present a simple model, based on Hilton's model, to establish the individuality of handwriting. Hilton calculated the odds using the likelihood ratio statistic: the ratio of the probability calculated on the basis of similarities, under the assumption of identity, to the probability calculated on the basis of dissimilarities, under the assumption of non-identity [3,5]. Various parametric and non-parametric techniques exist for the multiple-category classification problem, or polychotomizer for short, in which the number of classes is finite and small [6]. When the number of classes is too large for all of them to be observed (for example, the U.S. population), these techniques are of no use and the problem seems insurmountable. We therefore propose transforming the large, intractable polychotomizer into a simple dichotomizer, a classifier that places a pattern in one of only two categories: distance data between two writings by the same author, and distance data between writings by two different authors. In this model, one need not observe all classes, yet inferential classification is still possible.
We state the problem as follows: given two randomly selected handwritten documents, the writer identification problem is to determine whether the two documents were written by the same person, subject to two types of confusion error probabilities. To illustrate, suppose there are three writers. Each writer provides three documents, and two scalar-valued features are extracted per document. Fig. 1 (a) plots the two features of each document for every writer, and Fig. 1 (b) shows the transformed plot in the two-dimensional feature distance domain. Using eleven feature distance values, we trained an artificial neural network and obtained overall correctness.
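The transformation in the three-writer illustration can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the feature values are made up, the names `documents` and `to_distance_domain` are hypothetical, and the per-feature absolute difference is assumed as the distance, since the abstract does not say which distance measure was used.

```python
from itertools import combinations

# Hypothetical data: three writers, three documents each,
# two scalar features per document (values are invented).
documents = {
    "W1": [(0.10, 0.80), (0.15, 0.75), (0.12, 0.82)],
    "W2": [(0.50, 0.30), (0.55, 0.35), (0.48, 0.28)],
    "W3": [(0.90, 0.60), (0.85, 0.65), (0.88, 0.58)],
}


def to_distance_domain(docs_by_writer):
    """Map every pair of documents to a feature-distance vector.

    Returns two lists: distances between documents of the same writer
    and distances between documents of different writers.
    Assumption: distance = absolute difference per feature.
    """
    samples = [
        (writer, features)
        for writer, feature_list in docs_by_writer.items()
        for features in feature_list
    ]
    within, between = [], []
    for (writer_a, feats_a), (writer_b, feats_b) in combinations(samples, 2):
        distance = tuple(abs(a - b) for a, b in zip(feats_a, feats_b))
        if writer_a == writer_b:
            within.append(distance)
        else:
            between.append(distance)
    return within, between


within, between = to_distance_domain(documents)
print(len(within), "same-writer pairs;", len(between), "different-writer pairs")
```

With nine documents there are 36 pairs in total: 9 within-writer pairs and 27 between-writer pairs. A two-class classifier trained on these distance vectors never needs to see one class per writer, which is the point of replacing the polychotomizer with a dichotomizer.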
Similar resources
Building a State-of-the-Art Grammatical Error Correction System
This paper identifies and examines the key principles underlying building a state-of-the-art grammatical error correction system. We do this by analyzing the Illinois system that placed first among seventeen teams in the recent CoNLL-2013 shared task on grammatical error correction. The system focuses on five different types of errors common among non-native English writers. We describe four des...
ITRI-01-22 Rhetorical structure analysis as a method for understanding writing processes
Observation of writers producing non-trivial texts suggests that most spend time generating and organising content prior to drafting full text. Research suggests a link between time spent in these activities and the quality of the resulting document. Very little is known, however, about either the nature of the representation that is created during this planning phase (the writer's...
Automatic Analysis of Semantic Coherence in Academic Abstracts Written in Portuguese
SciPo is a system whose ultimate goal is to support novice writers in producing academic texts in Brazilian Portuguese through presentation of critiques and suggestions. Currently, it focuses on the rhetorical structure of texts, being capable of automatically detecting and criticizing the rhetorical structure of Abstract sections. We describe a system that enhances SciPo’s functionality by eva...
Welsh Women's Industrial Fiction 1880–1910
From the beginning of the genre, women writers have made a major contribution to the development of industrial writing. Although prevented from gaining first-hand experience of the coalface, Welsh women writers were amongst the first to try to fictionalize those heavy industries-coal and metal in the south, and slate in the north-which dominated the lives of the majority of the late nineteenth-...
MLL3/MLL4 are required for CBP/p300 binding on enhancers and super-enhancer formation in brown adipogenesis
Histone H3K4me1/2 methyltransferases MLL3/MLL4 and H3K27 acetyltransferases CBP/p300 are major enhancer epigenomic writers. To understand how these epigenomic writers orchestrate enhancer landscapes in cell differentiation, we have profiled genomic binding of MLL4, CBP, lineage-determining transcription factors (EBF2, C/EBPβ, C/EBPα, PPARγ), coactivator MED1, RNA polymerase II, as well as epige...
A Pattern Language for Pattern Language Structure
This paper aims to help the writers of pattern languages build better pattern languages. It focuses not on the esthetics of pattern languages, but on their structure: how patterns work together to build a system. The paper assumes that a pattern language is a designed system and, therefore, theory about system design and evolution underlies the language. In particular, the process of symmetry b...